Consider the problem of simultaneously testing null hypotheses
$H_1, \ldots, H_s$. The usual approach to dealing with the multiplicity
problem is to restrict attention to procedures that control the familywise
error rate (FWER), the probability of even one false rejection. In
many applications, particularly if $s$ is large, one might be willing to
tolerate more than one false rejection provided the number of such
cases is controlled, thereby increasing the ability of the procedure
to detect false null hypotheses. This suggests replacing control of
the FWER by controlling the probability of $k$ or more false rejections,
which we call the $k$-FWER. We derive both single-step and
stepdown procedures that control the $k$-FWER, without making any
assumptions concerning the dependence structure of the p-values of
the individual tests. In particular, we derive a stepdown procedure
that is quite simple to apply, and prove that it cannot be improved
without violation of control of the $k$-FWER. We also consider the
false discovery proportion (FDP) defined by the number of false rejections
divided by the total number of rejections (defined to be 0 if
there are no rejections). The false discovery rate proposed by Benjamini
and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300]
controls $E(\mathrm{FDP})$. Here, we construct methods such that, for any $\gamma$
and $\alpha$, $P\{\mathrm{FDP} > \gamma\} \le \alpha$. Two stepdown methods are proposed. The
first holds under mild conditions on the dependence structure of
p-values, while the second is more conservative but holds without any
dependence assumptions.

1. Introduction. In this paper, we will consider the general problem of 
simultaneously testing a finite number of null hypotheses $H_i$, $i = 1, \ldots, s$.
We shall assume that tests for the individual hypotheses are available and
the problem is how to combine them into a simultaneous test procedure.
The easiest approach is to disregard the multiplicity and simply test each
hypothesis at level $\alpha$. However, with such a procedure the probability of
one or more false rejections increases with $s$. When the number of true
hypotheses is large, we shall be nearly certain to reject some of them.

A classical approach to dealing with this problem is to restrict attention 
to procedures that control the probability of one or more false rejections. 
This probability is called the familywise error rate (FWER). Here the term
"family" refers to the collection of hypotheses $H_1, \ldots, H_s$ that is being
considered for joint testing. Which tests are to be treated jointly as a family
depends on the situation.

Once the family has been defined, control of the FWER (at joint level $\alpha$)
requires that

(1)  $\mathrm{FWER} \le \alpha$

for all possible constellations of true and false hypotheses. A quite broad
treatment of methods that control the FWER is presented in [4].

Safeguards against false rejections are of course not the only concern of 
multiple testing procedures. Corresponding to the power of a single test, one 
must also consider the ability of a procedure to detect departures from the 
hypothesis when they do occur. When the number of tests is in the tens or 
hundreds of thousands, control of the FWER at conventional levels becomes 
so stringent that individual departures from the hypothesis have little chance 
of being detected. For this reason, we shall consider an alternative to the 
FWER that controls false rejections less severely and consequently provides 
better power. 

Specifically, we shall consider the $k$-FWER, the probability of rejecting
at least $k$ true null hypotheses. Such an error rate with $k > 1$ is appropriate
when one is willing to tolerate one or more false rejections, provided the
number of false rejections is controlled.

More formally, suppose data $X$ is available from some model $P \in \Omega$.
A general hypothesis $H$ can be viewed as a subset $\omega$ of $\Omega$. For testing
$H_i : P \in \omega_i$, $i = 1, \ldots, s$, let $I(P)$ denote the set of true null hypotheses when
$P$ is the true probability distribution; that is, $i \in I(P)$ if and only if $P \in \omega_i$.
Then, the $k$-FWER, which depends on $P$, is defined to be

(2)  $k\text{-FWER} = P\{\text{reject at least } k \text{ hypotheses } H_i \text{ with } i \in I(P)\}$.

Control of the $k$-FWER requires that $k\text{-FWER} \le \alpha$ for all $P$, that is,

(3)  $P\{\text{reject at least } k \text{ hypotheses } H_i \text{ with } i \in I(P)\} \le \alpha$ for all $P$.

Evidently, the case $k = 1$ reduces to control of the usual FWER.

We will also consider control of the false discovery proportion (FDP),
defined as the total number of false rejections divided by the total number
of rejections (and equal to 0 if there are no rejections). Given a user-specified
value $\gamma \in (0, 1)$, the error measure we wish to control is
$P\{\mathrm{FDP} > \gamma\}$, and we derive methods where this is bounded by $\alpha$.

Recently, there has been a flurry of activity in finding methods that
control error rates that are less stringent than the FWER, which is no doubt
inspired by the FDR controlling method of Benjamini and Hochberg [1] and
applications such as genomic studies where $s$ is so large that control of the
FWER is too stringent. For example, Genovese and Wasserman [3] study
asymptotic procedures that control the FDP (and the FDR) in the framework
of a random effects mixture model. These ideas are extended in [9],
where in the context of random fields the number of null hypotheses is
uncountable. Korn, Troendle, McShane and Simon [8] provide methods that
control both the $k$-FWER and FDP; they provide some justification for their
methods, but they are limited to a multivariate permutation model.
Alternative methods of control of the $k$-FWER and FDP are given in van der
Laan, Dudoit and Pollard [13]; they include both finite sample and
asymptotic results. Surprisingly, the methods presented here are distinct from the
above techniques. Our methods are not asymptotic and hold under either
mild or no assumptions, as long as p-values are available for testing each
individual hypothesis.

Before describing methods that provide control of the $k$-FWER and FDP,
we first recall the notion of a p-value, since multiple testing methods are
often described by the p-values of the individual tests. Consider a single null
hypothesis $H : P \in \omega$. Assume a family of tests of $H$, indexed by $\alpha$, with
level $\alpha$ rejection regions $S_\alpha$ satisfying

(4)  $P\{X \in S_\alpha\} \le \alpha$ for all $0 < \alpha < 1$, $P \in \omega$,

and

(5)  $S_\alpha \subset S_{\alpha'}$ whenever $\alpha < \alpha'$.

Then the p-value is defined by

(6)  $\hat p = \hat p(X) = \inf\{\alpha : X \in S_\alpha\}$.

The important property of a p-value that will be used later is the following.



Lemma 1.1. Assume $\hat p$ is defined as above.

(i) If $P \in \omega$, then

(7)  $P\{\hat p \le u\} \le u$.

(ii) Furthermore,

(8)  $P\{\hat p \le u\} \ge P\{X \in S_u\}$.

Therefore, if the $S_\alpha$ are such that equality holds in (4), then $\hat p$ is uniformly
distributed on $(0, 1)$ when $P \in \omega$.

Proof. Assume $P \in \omega$. To prove (i), note that the event $\{\hat p \le u\}$ implies
$\{X \in S_{u+\varepsilon}\}$ for any small $\varepsilon > 0$. Therefore,

  $P\{\hat p \le u\} \le P\{X \in S_{u+\varepsilon}\} \le u + \varepsilon$

by assumption (4). Now let $\varepsilon \to 0$. To prove (ii), the event $\{X \in S_u\}$ implies
$\{\hat p \le u\}$, and so (8) follows. □
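
As a concrete illustration of definitions (4)-(6), the following sketch works through one familiar case. It is not taken from the paper: it assumes a single observation $X \sim N(\theta, 1)$, the hypothesis $H : \theta \le 0$, and the nested rejection regions $S_\alpha = \{x : x > z_{1-\alpha}\}$, for which the infimum in (6) equals $1 - \Phi(x)$. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def in_rejection_region(x: float, alpha: float) -> bool:
    """True if x lies in the assumed region S_alpha = {x : x > z_{1-alpha}}."""
    return x > STD_NORMAL.inv_cdf(1.0 - alpha)


def p_value(x: float) -> float:
    """inf{alpha : x in S_alpha}, which equals 1 - Phi(x) for this test family."""
    return 1.0 - STD_NORMAL.cdf(x)


x_obs = 1.96
p = p_value(x_obs)
# For these nested regions, x_obs is rejected at level alpha exactly when p < alpha.
for alpha in (0.01, 0.05, 0.10):
    print(alpha, in_rejection_region(x_obs, alpha), p < alpha)
```

Because equality holds in (4) for this family at $\theta = 0$, Lemma 1.1 says the resulting p-value is uniform on $(0, 1)$ on the boundary of the null hypothesis.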

Two classic procedures that control the FWER are the Bonferroni procedure
and the Holm procedure. The Bonferroni procedure rejects $H_i$ if its
corresponding p-value satisfies $\hat p_i \le \alpha/s$. Assuming $\hat p_i$ satisfies

(9)  $P\{\hat p_i \le u\} \le u$ for any $u \in (0, 1)$ and any $P \in \omega_i$,

the Bonferroni procedure provides strong control of the FWER. Unfortunately,
the ability of the Bonferroni procedure to detect cases in which $H_i$
is false will typically be very low, since $H_i$ is tested at level $\alpha/s$ which,
particularly if $s$ is large, is orders of magnitude smaller than the conventional $\alpha$ levels.

For this reason procedures are prized for which the levels of the individual
tests are increased over $\alpha/s$ without an increase in the FWER. It turns out
that such a procedure due to Holm [5] is available under the present minimal
assumptions.

The Holm procedure can conveniently be stated in terms of the p-values
$\hat p_1, \ldots, \hat p_s$ of the $s$ individual tests. Let the ordered p-values be denoted by
$\hat p_{(1)} \le \cdots \le \hat p_{(s)}$, and the associated hypotheses by $H_{(1)}, \ldots, H_{(s)}$. Then the
Holm procedure is defined stepwise as follows:

Step 0. Let $k = 0$.

Step 1. If $\hat p_{(k+1)} > \alpha/(s - k)$, go to step 2. Otherwise set $k = k + 1$ and
repeat step 1.

Step 2. Reject $H_{(j)}$ for $j \le k$ and accept $H_{(j)}$ for $j > k$.

The Bonferroni method is an example of a single-step procedure, meaning
any null hypothesis is rejected if its corresponding p-value is less than or
equal to a common cutoff value (which in the Bonferroni case is $\alpha/s$). The
Holm procedure is a special case of a class of stepdown procedures, which
we now briefly describe. Let

(10)  $\alpha_1 \le \alpha_2 \le \cdots \le \alpha_s$

be constants. If $\hat p_{(1)} > \alpha_1$, reject no null hypotheses. Otherwise, if

(11)  $\hat p_{(1)} \le \alpha_1, \ldots, \hat p_{(r)} \le \alpha_r$,

reject hypotheses $H_{(1)}, \ldots, H_{(r)}$, where the largest $r$ satisfying (11) is used.
That is, a stepdown procedure starts with the most significant p-value and
continues rejecting hypotheses as long as their corresponding p-values are
small. The Holm procedure uses $\alpha_i = \alpha/(s - i + 1)$.
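
The stepdown rule (10)-(11) and the Holm constants can be sketched in a few lines. This is only an illustrative implementation, not code from the paper; the function names `stepdown_reject` and `holm_constants` and the example p-values are assumptions made for the illustration.

```python
from typing import List, Sequence


def stepdown_reject(pvalues: Sequence[float], alphas: Sequence[float]) -> List[int]:
    """Indices (into pvalues) rejected by the stepdown rule (11).

    alphas[i] is the critical constant for the (i+1)-th smallest p-value and
    is assumed to be nondecreasing, as in (10).
    """
    if len(pvalues) != len(alphas):
        raise ValueError("need one critical constant per hypothesis")
    # Visit hypotheses from most to least significant.
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    rejected: List[int] = []
    for rank, idx in enumerate(order):
        if pvalues[idx] > alphas[rank]:
            break  # the first p-value above its constant stops the procedure
        rejected.append(idx)
    return rejected


def holm_constants(s: int, alpha: float) -> List[float]:
    """Holm constants alpha_i = alpha / (s - i + 1) for i = 1, ..., s."""
    return [alpha / (s - i + 1) for i in range(1, s + 1)]


pvals = [0.001, 0.04, 0.012, 0.5]  # hypothetical p-values
print(stepdown_reject(pvals, holm_constants(len(pvals), 0.05)))
```

With these inputs the third-smallest p-value, 0.04, exceeds its constant $0.05/2 = 0.025$, so only the two most significant hypotheses are rejected.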
2. Control of the $k$-FWER. The usual Bonferroni procedure compares
each p-value $\hat p_i$ with $\alpha/s$. Control of the $k$-FWER allows one to increase $\alpha/s$
to $k\alpha/s$, and thereby greatly increase the ability to detect false hypotheses.
That such a simple modification results in control of the $k$-FWER is seen in
the following result.

Theorem 2.1. For testing $H_i : P \in \omega_i$, $i = 1, \ldots, s$, suppose $\hat p_i$ satisfies
(9). Consider the procedure that rejects any $H_i$ for which $\hat p_i \le k\alpha/s$.

(i) This procedure controls the $k$-FWER, so that (3) holds. Equivalently,
if each of the hypotheses is tested at level $k\alpha/s$, then the $k$-FWER is
controlled.

(ii) For this procedure, the inequality (3) is sharp in the sense that there
exists a joint distribution for $(\hat p_1, \ldots, \hat p_s)$ for which equality is attained in
(3).

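Proof. (i) Fix any $P$ and suppose $H_i$ with $i \in I = I(P)$ are true and
the remainder false, with $|I|$ denoting the cardinality of $I$. Let $N$ be the
number of false rejections. Then, by Markov's inequality,

$$
\begin{aligned}
P\{N \ge k\} \le \frac{E(N)}{k}
  &= \frac{1}{k}\, E\Big[\sum_{i \in I(P)} \mathbf{1}\{\hat p_i \le k\alpha/s\}\Big]
   = \frac{1}{k} \sum_{i \in I(P)} P\{\hat p_i \le k\alpha/s\} \\
  &\le \frac{1}{k} \sum_{i \in I(P)} \frac{k\alpha}{s}
   = |I(P)|\,\frac{\alpha}{s} \le \alpha.
\end{aligned}
$$

To prove (ii), consider the following construction. Pick $k$ indices at random
without replacement from $\{1, \ldots, s\}$. Call them $J$. Given $i \in J$, let
$\hat p_i = U_1$, where $U_1$ is uniform on $(0, k/s)$, that is, $U_1 \sim U(0, k/s)$. Given
$i \notin J$, let $\hat p_i = U_2$, where $U_2$ is independent of $U_1$ and $U_2 \sim U(k/s, 1)$. Then,
unconditionally, $\hat p_i \sim U(0, 1)$.

Indeed, if $u \le k/s$,

  $P\{\hat p_i \le u\} = P\{i \in J\} \cdot P\{U_1 \le u\} = \frac{k}{s} \cdot \frac{u}{k/s} = u$,

and if $u \ge k/s$,

  $P\{\hat p_i \le u\} = P\{i \in J\} \cdot 1 + P\{i \notin J\} \cdot P\{U_2 \le u\} = \frac{k}{s} + \Big(1 - \frac{k}{s}\Big) \frac{u - k/s}{1 - k/s} = u$.

Now exactly $k$ of the $\hat p_i$ are less than or equal to $k/s$ by construction. The
probability that these are all less than or equal to $\alpha k/s$ is

  $P\Big\{U_1 \le \frac{\alpha k}{s}\Big\} = \frac{\alpha k/s}{k/s} = \alpha$.

□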
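
The single-step rule of Theorem 2.1 and the marginal calculation in the proof of part (ii) can be checked with a short sketch. The code below is an illustration under stated assumptions rather than part of the paper: the function names are hypothetical, `marginal_cdf` assumes $k < s$, and exact rational arithmetic is used so that the identity $P\{\hat p_i \le u\} = u$ can be verified without rounding error.

```python
from fractions import Fraction
from typing import List, Sequence


def single_step_kfwer(pvalues: Sequence[float], k: int, alpha: float) -> List[int]:
    """Indices of hypotheses whose p-value is <= k * alpha / s (Theorem 2.1)."""
    s = len(pvalues)
    cutoff = k * alpha / s
    return [i for i, p in enumerate(pvalues) if p <= cutoff]


def marginal_cdf(u: Fraction, k: int, s: int) -> Fraction:
    """P{p_i <= u} under the construction in the proof of part (ii).

    With probability k/s the p-value is U(0, k/s); otherwise it is U(k/s, 1).
    Assumes 0 <= u <= 1 and k < s.
    """
    c = Fraction(k, s)
    low = min(u / c, Fraction(1))                # P{U_1 <= u}
    high = max((u - c) / (1 - c), Fraction(0))   # P{U_2 <= u}
    return c * low + (1 - c) * high


pvals = [0.001, 0.015, 0.03, 0.2, 0.6]  # hypothetical p-values
print(single_step_kfwer(pvals, k=1, alpha=0.05))  # ordinary Bonferroni
print(single_step_kfwer(pvals, k=2, alpha=0.05))  # cutoff doubled to 2 * alpha / s
print(marginal_cdf(Fraction(1, 2), k=3, s=20))
```

Moving from $k = 1$ to $k = 2$ doubles the cutoff, which is the gain in power that the theorem licenses.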

As is the case for the Bonferroni method, the above single-step procedure
can be strengthened by a Holm type of improvement. Consider the stepdown
procedure described in (11), where now we specifically consider

(12)  $\alpha_i = \begin{cases} \dfrac{k\alpha}{s}, & i \le k, \\[1ex] \dfrac{k\alpha}{s + k - i}, & i > k. \end{cases}$

Of course, the $\alpha_i$ depend on $s$ and $k$, but we suppress this dependence in
the notation.

Theorem 2.2. For testing $H_i : P \in \omega_i$, $i = 1, \ldots, s$, suppose $\hat p_i$ satisfies
(9). The stepdown procedure described in (11) with $\alpha_i$ given by (12)
controls the $k$-FWER, that is, (3) holds.

Proof. Fix any $P$ and let $I(P)$ be the indices of the true null hypotheses.
Assume $|I(P)| \ge k$ or there is nothing to prove. Order the p-values
corresponding to the $|I(P)|$ true null hypotheses; call them

  $\hat q_{(1)} \le \cdots \le \hat q_{(|I(P)|)}$.

Let $j$ be the smallest (random) index satisfying $\hat p_{(j)} = \hat q_{(k)}$, so

(13)  $k \le j \le s - |I(P)| + k$

because the largest possible index $j$ occurs when all the smallest p-values
correspond to the $s - |I(P)|$ false null hypotheses and the next $|I(P)|$ p-values
correspond to the true null hypotheses. So $\hat p_{(j)} = \hat q_{(k)}$. Then our generalized
Holm procedure commits at least $k$ false rejections if and only if

  $\hat p_{(1)} \le \alpha_1, \ \hat p_{(2)} \le \alpha_2, \ \ldots, \ \hat p_{(j)} \le \alpha_j$,

which certainly implies that

  $\hat q_{(k)} = \hat p_{(j)} \le \alpha_j = \frac{k\alpha}{s + k - j}$.

But by (13),

  $\frac{k\alpha}{s + k - j} \le \frac{k\alpha}{|I(P)|}$.

So the probability of at least $k$ false rejections is bounded above by

  $P\Big\{\hat q_{(k)} \le \frac{k\alpha}{|I(P)|}\Big\}$.

By Theorem 2.1(i), applied to the $|I(P)|$ true null hypotheses, the chance that the
$k$th smallest among their p-values is less than or equal to $k\alpha/|I(P)|$ is less
than or equal to $\alpha$. □
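
The constants in (12) are easy to tabulate, and comparing them with the Holm constants shows how much each critical value is relaxed. The following is a minimal sketch, not the authors' code; the helper names and the five example p-values are assumptions chosen for illustration.

```python
from typing import List, Sequence


def kfwer_stepdown_constants(s: int, k: int, alpha: float) -> List[float]:
    """Constants (12): k*alpha/s for i <= k and k*alpha/(s + k - i) for i > k."""
    return [
        k * alpha / s if i <= k else k * alpha / (s + k - i)
        for i in range(1, s + 1)
    ]


def stepdown_reject(pvalues: Sequence[float], alphas: Sequence[float]) -> List[int]:
    """Indices rejected by the stepdown rule (11) with the given constants."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    rejected: List[int] = []
    for rank, idx in enumerate(order):
        if pvalues[idx] > alphas[rank]:
            break  # stop at the first ordered p-value above its constant
        rejected.append(idx)
    return rejected


pvals = [0.001, 0.015, 0.03, 0.2, 0.024]  # hypothetical p-values
holm = kfwer_stepdown_constants(len(pvals), k=1, alpha=0.05)   # k = 1 gives Holm
two_fwer = kfwer_stepdown_constants(len(pvals), k=2, alpha=0.05)
print(stepdown_reject(pvals, holm))
print(stepdown_reject(pvals, two_fwer))
```

Setting $k = 1$ in (12) recovers the Holm constants $\alpha/(s - i + 1)$, so the same function covers both procedures.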
Remark 2.1. Evidently, one can always reject the hypotheses corresponding
to the smallest $k - 1$ p-values without violating control of the
$k$-FWER. However, it seems counterintuitive to consider a stepdown procedure
whose corresponding $\alpha_i$ are not monotone nondecreasing. In addition,
automatic rejection of $k - 1$ hypotheses, regardless of the data, appears at
the very least a little too optimistic. To ensure monotonicity, our stepdown
procedure uses $\alpha_i = k\alpha/s$ for $i \le k$. Even if we were to adopt the more optimistic
strategy of always rejecting the hypotheses corresponding to the smallest $k - 1$
p-values, we could still only reject $k$ or more hypotheses if $\hat p_{(k)} \le k\alpha/s$,
which is also true for the specific procedure of Theorem 2.2.

Remark 2.2. If the p-values have discrete distributions, it is possible
that there may be ties among them. However, the proof remains valid
regardless of how tied p-values are ordered because monotonicity of the $\alpha_i$
ensures that all hypotheses with a common tied p-value will be rejected if
any of them are rejected.

The question naturally arises whether it is possible to improve the procedure
further by increasing the critical values $\alpha_1, \alpha_2, \ldots$ without violating
control of the $k$-FWER (3). By Remark 2.1 we can always increase
$\alpha_i$ to 1 for $i < k$. A more interesting question is whether we can increase
$\alpha_i$ for $i \ge k$. We will show that this is not possible by exhibiting for each $i \ge k$
a joint distribution of the p-values for which

(14)  $P\{\hat p_{(1)} \le \alpha_1, \hat p_{(2)} \le \alpha_2, \ldots, \hat p_{(i-1)} \le \alpha_{i-1}, \hat p_{(i)} \le \alpha_i\} = \alpha$.

Moreover, changing $\alpha_i$ to $\beta_i > \alpha_i$ results in the right-hand side being greater
than $\alpha$. Thus, with $i \ge k$, one cannot increase $\alpha_i$ without violating the
$k$-FWER. Then, having picked $\alpha_1, \ldots, \alpha_{i-1}$, the largest possible choice
for $\alpha_i$ is as stated in the algorithm.

Theorem 2.3. (i) Let the $\alpha_i$ be given in (12). For any $i \ge k$ there exists
a joint distribution for $\hat p_1, \ldots, \hat p_s$ such that $s + k - i$ of the $\hat p_i$ are uniformly
distributed on $(0, 1)$ and (14) holds.

(ii) For testing $H_i : P \in \omega_i$, $i = 1, \ldots, s$, suppose $\hat p_i$ satisfies (9). For the
stepdown procedure (11) with $\alpha_i$ given in (12), one cannot increase even one
of the constants $\alpha_i$ (for $i \ge k$) without violating the $k$-FWER.

Before proving the theorem, we make use of the following lemma.

Lemma 2.1. Fix $k$, $u$ and constants $0 < \beta_1 \le \beta_2 \le \cdots \le \beta_k \le u$. Assume
for every $j = 2, \ldots, k$,

(15)  $\frac{j(\beta_j - \beta_{j-1})}{\beta_j} \le 1$.

Then there exists a joint distribution for $(\hat q_1, \ldots, \hat q_k)$ such that the $\hat q_i$ are
marginally uniform on $(0, u)$ and the ordered values $\hat q_{(1)} \le \cdots \le \hat q_{(k)}$
satisfy

(16)  $P\{\hat q_{(1)} \le \beta_1, \ldots, \hat q_{(k)} \le \beta_k\} = \beta_k/u$.

Proof. The proof is by induction on $k$. The result clearly holds for $k = 1$.
With probability $\beta_k/u$ we will construct $(\hat q_1, \ldots, \hat q_k)$ equal to $(\tilde q_1, \ldots, \tilde q_k)$,
where $\tilde q_i \sim U(0, \beta_k)$ for $i = 1, \ldots, k$ and such that their ordered values
$\tilde q_{(1)} \le \cdots \le \tilde q_{(k)}$ satisfy

(17)  $P\{\tilde q_{(1)} \le \beta_1, \ldots, \tilde q_{(k)} \le \beta_k\} = 1$.

But, with probability $1 - \beta_k/u$, construct the $\hat q_j$ to be conditionally
distributed as $U(\beta_k, u)$. Then unconditionally the $\hat q_j$ satisfy (16) and are marginally
distributed as $U(0, u)$. So it suffices to construct the $\tilde q_j$ satisfying
$\tilde q_j \sim U(0, \beta_k)$ and (17).

Let $\beta_0 = 0$ and for $i = 1, \ldots, k$ let $E_i = (\beta_{i-1}, \beta_i]$ and $p_i = \beta_i - \beta_{i-1}$.

First construct $Y_1, \ldots, Y_{k-1}$, each taking values in $(0, \beta_{k-1}]$, such that their
ordered values $Y_{(1)} \le \cdots \le Y_{(k-1)}$ satisfy

(18)  $P\{Y_{(1)} \le \beta_1, \ldots, Y_{(k-1)} \le \beta_{k-1}\} = 1$

and $Y_i$ is uniform on $(0, \beta_{k-1}]$. This is possible by the inductive hypothesis,
since we can assume the result holds for $k - 1$ as long as $\beta_1, \ldots, \beta_{k-1}$ and $u$
satisfy the stated conditions; in particular, we apply the result with $u = \beta_{k-1}$.
Next, let $Y_k$ be uniform on $E_i$ with probability $\theta p_i$ for $i = 1, \ldots, k - 1$ and
let it be uniform on $E_k$ with probability $1 - \theta\beta_{k-1}$, where $\theta$ satisfies

(19)  $\theta = \frac{1}{\beta_{k-1}} \Big[ 1 - \frac{k(\beta_k - \beta_{k-1})}{\beta_k} \Big]$.

Finally, let $\tilde q_1, \ldots, \tilde q_k$ be a random permutation of $Y_1, \ldots, Y_k$. Because of
(18) and the fact that $Y_k \le \beta_k$, the ordered values of $Y_1, \ldots, Y_k$ and hence
the ordered values of $\tilde q_1, \ldots, \tilde q_k$ satisfy (17). Furthermore, it is easy to check
that $\tilde q_i$ falls in $E_j$ with probability $p_j/\beta_k$ and so $\tilde q_i$ is $U(0, \beta_k)$. Indeed, if $j < k$,
the probability that $\tilde q_i$ falls in $E_j$, conditional on $\tilde q_i$ not being equal to $Y_k$,
is $p_j/\beta_{k-1}$ and is $\theta p_j$ in the latter case, which unconditionally is

  $\frac{k-1}{k} \cdot \frac{p_j}{\beta_{k-1}} + \frac{1}{k}\,\theta p_j = \frac{p_j}{\beta_k}$,

and similarly for the probability that $\tilde q_i$ falls in $E_k$. The only detail that
remains is to note that this construction with $\theta$ defined in (19) is possible
only if $\theta p_i$ and $1 - \theta\beta_{k-1}$ are all values in $[0, 1]$. But

  $1 - \theta\beta_{k-1} = \frac{k(\beta_k - \beta_{k-1})}{\beta_k}$,

which is certainly $\ge 0$ since $\beta_k \ge \beta_{k-1}$. It is also $\le 1$ by the assumption (15).
Also,

  $\theta p_i = \frac{p_i}{\beta_{k-1}} \Big[ 1 - \frac{k(\beta_k - \beta_{k-1})}{\beta_k} \Big]$.

But the first factor $p_i/\beta_{k-1}$ is in $[0, 1]$ as is the latter by the above, and so
the product is in $[0, 1]$. □

Proof of Theorem 2.3. The case $i = k$ follows from the construction
in the proof of Theorem 2.1. Let the first $i - k$ of the $\hat p_j$ be identically equal
to 0. (Actually, rather than point mass at 0, any distribution supported on
$[0, \alpha_1)$ will do.) For the remaining $s' = s + k - i$ p-values $\hat p_j$, $j = i - k + 1, \ldots, s$,
randomly choose $k$ indices from $i - k + 1, \ldots, s$. The $k$ that are chosen will
be marginally $U(0, k/s')$ and have a joint distribution which will be specified
below; the remaining $s - i$ can be taken to be distributed as $U(k/s', 1)$.

Let $\hat q_1, \ldots, \hat q_k$ denote the $k$ observations that are marginally $U(0, k/s')$.
We need to specify the joint distribution of $\hat q_1, \ldots, \hat q_k$ so that their ordered
values $\hat q_{(1)} \le \cdots \le \hat q_{(k)}$ satisfy

(20)  $P\{\hat q_{(1)} \le \alpha_{i-k+1}, \hat q_{(2)} \le \alpha_{i-k+2}, \ldots, \hat q_{(k)} \le \alpha_i\} = \alpha$

(because $\hat p_{(i-k+j)} = \hat q_{(j)}$ for $j = 1, \ldots, k$). So the problem reduces to
constructing a joint distribution for $(\hat q_1, \ldots, \hat q_k)$ satisfying (20) subject to the
constraint that $\hat q_j$ is marginally distributed as $U(0, k/s')$. To do this, apply
Lemma 2.1 with $u = k/s'$ and $\beta_j = \alpha_{i-k+j}$. We need to verify the conditions
of the lemma, which reduces to showing

(21)  $\frac{j(\alpha_{i-k+j} - \alpha_{i-k+j-1})}{\alpha_{i-k+j}} \le 1$
for $i \ge k$ (and $s$ and $k$ fixed). But, if $i - k + j - 1 < k$, then the left-hand
side of (21) is 0; otherwise it is easily seen to simplify to

(22)  $\frac{j}{s + 2k - i - j + 1} \le \frac{k}{s + k - i + 1} \le \frac{k}{s'}$,

where the first inequality holds because $j \le k$ and the second because
$s + k - i + 1 > s + k - i = s'$.
But $k/s' \le 1$ because $i \le s$, and so the conditions of the lemma are satisfied. Therefore, we
can conclude that the left-hand side of (20) is given by

  $\frac{\beta_k}{u} = \frac{\alpha_i}{k/s'} = \alpha$,

and (i) is proved.

To prove (ii), the construction used in (i) can be used even if $\alpha_i$ is replaced
by $\beta_i > \alpha_i$, as long as such a switch still allows one to appeal to the lemma.
However, the same argument works as long as $\beta_i$ does not get bigger than
$(s/k) \cdot \alpha_i$, so that the argument leading to (22) being less than or equal to
1 still applies. For such a $\beta_i$, the argument for (i) then shows that, if the
left-hand side of (14) has $\alpha_i$ replaced by $c\alpha_i$ for some $1 < c < s/k$, then the
right-hand side of (14) will be $c\alpha > \alpha$, which would violate control of the
$k$-FWER. □

3. Control of the false discovery proportion. The number $k$ of false
rejections that one is willing to tolerate will often increase with the number
of hypotheses rejected. So it might be of interest to control not the number
of false rejections (sometimes called false discoveries) but the proportion of
false discoveries. Specifically, let the false discovery proportion (FDP) be
defined by

(23)  $\mathrm{FDP} = \begin{cases} \dfrac{\text{Number of false rejections}}{\text{Total number of rejections}}, & \text{if the denominator is greater than 0}, \\[1ex] 0, & \text{if there are no rejections}. \end{cases}$

Thus FDP is the proportion of rejected hypotheses that are rejected
erroneously. When none of the hypotheses is rejected, both numerator and
denominator of that proportion are 0; since in particular there are no false
rejections, the FDP is then defined to be 0.

Benjamini and Hochberg [1] proposed to replace control of the FWER by
control of the false discovery rate (FDR), defined as

(24)  $\mathrm{FDR} = E(\mathrm{FDP})$.

The FDR has gained wide acceptance in both theory and practice, largely
because Benjamini and Hochberg proposed a simple stepup procedure to
control the FDR. Unlike control of the $k$-FWER, however, their procedure
is not valid without assumptions on the dependence structure of the p-values.
Their original paper made the very strong assumption of independence
of p-values, but this has been weakened to include certain types of
dependence; see [2]. In any case, control of the FDR does not prohibit the FDP
from varying, even if its average value is bounded. Instead, we consider an
alternative measure of control that guarantees the FDP is bounded, at least
with prescribed probability. That is, for a given $\gamma$ and $\alpha$ in $(0, 1)$, we require

(25)  $P\{\mathrm{FDP} > \gamma\} \le \alpha$.

To develop a stepdown procedure satisfying (25), let $F$ denote the number
of false rejections. At step $i$, having rejected $i - 1$ hypotheses, we want to
guarantee $F/i \le \gamma$, that is, $F \le \lfloor \gamma i \rfloor$, where $\lfloor x \rfloor$ is the greatest integer less
than or equal to $x$. So, if $k = \lfloor \gamma i \rfloor + 1$, then $F \ge k$ should have probability
no greater than $\alpha$; that is, the number of false rejections must be kept
below $k$ with probability at least $1 - \alpha$. Therefore, we use the stepdown constant $\alpha_i$ with
this choice of $k$ (which now depends on $i$); that is,

(26)  $\alpha_i = \frac{(\lfloor \gamma i \rfloor + 1)\,\alpha}{s + \lfloor \gamma i \rfloor + 1 - i}$.
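
The constants in (26) can be tabulated directly and plugged into the stepdown rule (11). The code below is a sketch under stated assumptions, not the authors' implementation: the function names and example p-values are hypothetical, and $\gamma$ is passed as an exact fraction so that the floor $\lfloor \gamma i \rfloor$ is not affected by floating-point rounding.

```python
import math
from fractions import Fraction
from typing import List, Sequence


def fdp_stepdown_constants(s: int, gamma: Fraction, alpha: float) -> List[float]:
    """Constants (26): alpha_i = k*alpha / (s + k - i) with k = floor(gamma*i) + 1."""
    constants = []
    for i in range(1, s + 1):
        k = math.floor(gamma * i) + 1  # exact because gamma is a Fraction
        constants.append(k * alpha / (s + k - i))
    return constants


def stepdown_reject(pvalues: Sequence[float], alphas: Sequence[float]) -> List[int]:
    """Indices rejected by the stepdown rule (11) with the given constants."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    rejected: List[int] = []
    for rank, idx in enumerate(order):
        if pvalues[idx] > alphas[rank]:
            break  # stop at the first ordered p-value above its constant
        rejected.append(idx)
    return rejected


# Hypothetical p-values for s = 10 hypotheses, gamma = 0.1, alpha = 0.05.
pvals = [0.001, 0.004, 0.006, 0.007, 0.3, 0.009, 0.5, 0.011, 0.8, 0.02]
consts = fdp_stepdown_constants(len(pvals), Fraction(1, 10), 0.05)
print([round(c, 5) for c in consts])
print(stepdown_reject(pvals, consts))
```

For $\gamma i < 1$ the constants coincide with the Holm constants $\alpha/(s - i + 1)$; the numerator grows by one each time $\gamma i$ crosses an integer.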



We give two results that show the stepdown procedure with this choice
of $\alpha_i$ satisfies (25). Unfortunately, like FDR control, some assumptions on
the dependence of p-values are required, at least by our method of proof.
Later, we will modify the method so we can dispense with the dependence
assumptions. As before, $\hat p_1, \ldots, \hat p_s$ denotes the p-values of the individual
tests. Also, let $\hat q_1, \ldots, \hat q_{|I|}$ denote the p-values corresponding to the
$|I| = |I(P)|$ true null hypotheses. So $\hat q_i = \hat p_{j_i}$, where $j_1, \ldots, j_{|I|}$ correspond to the
indices of the true null hypotheses. Also, let $\hat r_1, \ldots, \hat r_{s-|I|}$ denote the p-values
of the false null hypotheses. Consider the following condition: for any
$i = 1, \ldots, |I|$,

(27)  $P\{\hat q_i \le u \mid \hat r_1, \ldots, \hat r_{s-|I|}\} \le u$;

that is, conditional on the observed p-values of the false null hypotheses,
a p-value corresponding to a true null hypothesis is (conditionally)
dominated by the uniform distribution, as it is unconditionally in the sense of
(7). No assumption is made regarding the unconditional (or conditional)
dependence structure of the true p-values, nor is there made any explicit
assumption regarding the joint structure of the p-values corresponding to
false hypotheses, other than the basic assumption (27). So, for example, if
the p-values corresponding to true null hypotheses are independent of the
false ones, but have arbitrary joint dependence within the group of true null
hypotheses, the above assumption holds.

Theorem 3.1. Assume condition (27). Then the stepdown procedure
with $\alpha_i$ given by (26) controls the FDP in the sense of (25).

Proof. Assume the number of true null hypotheses is $|I(P)| > 0$ (or
there is nothing to prove) and the number of false null hypotheses is
$f = s - |I(P)|$. The argument is conditional on the $\{\hat r_i\}$. Let

  $\hat r_{(1)} \le \hat r_{(2)} \le \cdots \le \hat r_{(f)}$

denote the ordered values of the $\hat r_j$ and similarly for the $\hat q_i$. Let $\alpha_0 = 0$ and
define $R_i$ to be the number of $\hat r_j$ in the interval $(\alpha_{i-1}, \alpha_i]$. (Actually, assume
$R_1$ includes the value 0 as well.) Given the values of $\hat r_1, \ldots, \hat r_f$, it may be
impossible to have $\mathrm{FDP} > \gamma$, that is,

  $P\{\mathrm{FDP} > \gamma \mid \hat r_1, \ldots, \hat r_f\} = 0$.

Otherwise, let $j = j(\hat r_1, \ldots, \hat r_f)$ be defined as

(28)  $j = \min\Big\{ m : m - \sum_{i=1}^{m} R_i > \gamma m \Big\}$.

To interpret this, given the p-values of the false hypotheses, $j$ is the smallest
critical index (depending only on the $\hat r_j$) where it is possible to have
$\mathrm{FDP} > \gamma$, except whenever there are several p-values within an interval $(\alpha_{i-1}, \alpha_i]$ we
consider the index of the largest one. The point of the construction is that
if the stepdown procedure stops at an index $m < j$, then $(m - \sum_{i=1}^{m} R_i)/m \le \gamma$
and so $\mathrm{FDP} \le \gamma$. On the other hand, if the event $\mathrm{FDP} > \gamma$ occurs, then
there must be a rejection of a true null hypothesis at step $j$.

For example, if $s = 100$, $f = 5$ and $\gamma = 0.1$, then if all five of the $\hat r_i$ are
less than $\alpha_1$, then we define $j = 6$ even though the smallest true p-value
could be the smallest among the 100. So the FDP could be greater than 0.1
after the first step of the algorithm if $\hat q_{(1)} < \hat r_{(1)}$, but even if this is the case,
we then know we will reject at least six total hypotheses. So the important
point here is that, given such a configuration of $\{\hat r_i\}$, in order for FDP to be
greater than 0.1, it must be the case that we reject a true null hypothesis
at step 6.
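
The critical index in (28) depends only on the counts $R_1, R_2, \ldots$, so the worked example above can be reproduced mechanically. The sketch below is illustrative only; the function name `critical_index` is an assumption, and $\gamma$ is represented as an exact fraction to avoid rounding in the comparison.

```python
from fractions import Fraction
from typing import Optional, Sequence


def critical_index(R: Sequence[int], gamma: Fraction) -> Optional[int]:
    """Smallest m with m - (R_1 + ... + R_m) > gamma * m, as in (28).

    R[m - 1] holds R_m, the number of false-null p-values in (alpha_{m-1}, alpha_m].
    Returns None when no index satisfies the criterion.
    """
    cumulative = 0
    for m, r_m in enumerate(R, start=1):
        cumulative += r_m
        if m - cumulative > gamma * m:
            return m
    return None


# The example from the text: s = 100, f = 5, gamma = 0.1, and all five
# false-null p-values fall below alpha_1, so R_1 = 5 and R_m = 0 otherwise.
s, f, gamma = 100, 5, Fraction(1, 10)
R = [f] + [0] * (s - 1)
print(critical_index(R, gamma))
```

For $m \le 5$ the quantity $m - 5$ is not positive, and at $m = 6$ it equals $1 > 0.6$, which is why the index is 6.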

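Note that, with $j$ so defined, $R_j = 0$. For if $\sum_{i=1}^{j} R_i = j - k$ with $k/j > \gamma$
and $R_j > 0$, then

  $\sum_{i=1}^{j-1} R_i = j - k - R_j \le j - 1 - k$

and $k/(j - 1) > \gamma$, so that $m = j - 1$ satisfies the criterion. Furthermore,
we also have $\sum_{i=1}^{j} R_i = j - k$ (so not $< j - k$), where $k/j > \gamma$, because if
$\sum_{i=1}^{j} R_i < j - k$, so that $\sum_{i=1}^{j-1} R_i \le j - 1 - k$ say, then $k/(j - 1) > \gamma$ if $k/j > \gamma$ and so $j$ can
again be reduced to $j - 1$.

In addition, at the index $j$ it must be the case that

  $k = k(j) = j - \sum_{i=1}^{j} R_i = 1 + \lfloor \gamma j \rfloor$.

Indeed, $k > \gamma j$ implies $k \ge \lfloor \gamma j \rfloor + 1$. But if $k > \lfloor \gamma j \rfloor + 1$, then $k - 1 \ge \lfloor \gamma j \rfloor + 1$
and so

  $\frac{k - 1}{j - 1} \ge \frac{\lfloor \gamma j \rfloor + 1}{j - 1} > \gamma$,

the last inequality trivially following from $1 + \lfloor \gamma j \rfloor > \gamma j > \gamma(j - 1)$. Since
$R_j = 0$, this would mean that $m = j - 1$ already satisfies the criterion in (28),
contradicting the minimality of $j$.

We can now complete the argument. At the index $j$ we must have
$k = j - \sum_{i=1}^{j} R_i = 1 + \lfloor \gamma j \rfloor$ of the $\hat q_i$ being $\le \alpha_j$. But from Theorem 2.1 (applied
conditional on the $\hat r_i$),

$$
\begin{aligned}
P\{\text{at least } k(j) \text{ of the } \hat q_i \le \alpha_j \mid \hat r_1, \ldots, \hat r_f\}
  &\le \frac{|I|\,\alpha_j}{k(j)}
   = \frac{|I|\,(\lfloor \gamma j \rfloor + 1)\,\alpha}{k(j)\,(s + \lfloor \gamma j \rfloor + 1 - j)} \\
  &= \frac{|I|\,\alpha}{s + \lfloor \gamma j \rfloor + 1 - j}.
\end{aligned}
$$

But $|I| \le s - \sum_{i=1}^{j} R_i = s - j + k$, so the above probability is less than or
equal to

  $\frac{s - j + k}{s + \lfloor \gamma j \rfloor + 1 - j}\,\alpha = \alpha$.

Therefore,

  $P\{\mathrm{FDP} > \gamma \mid \hat r_1, \ldots, \hat r_f\} \le \alpha$,

which of course implies $P\{\mathrm{FDP} > \gamma\} \le \alpha$. □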

Next, we prove the same stepdown procedure controls the FDP in the 
sense of (25) under an alternative assumption. Here, the assumption only 
involves the dependence of the p-values corresponding to true null hypothe- 
ses. 

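Theorem 3.2. Consider testing $s$ null hypotheses, with $|I|$ of them true.
Let $\hat q_{(1)} \le \cdots \le \hat q_{(|I|)}$ denote their corresponding ordered p-values. Set
$M = \min(\lfloor \gamma s \rfloor + 1, |I|)$.

(i) For the stepdown procedure with $\alpha_i$ given by (26),

(29)  $P\{\mathrm{FDP} > \gamma\} \le P\Big\{ \bigcup_{i=1}^{M} \Big\{ \hat q_{(i)} \le \frac{i\alpha}{|I|} \Big\} \Big\}$.

(ii) Therefore, if the joint distribution of the p-values of the true null
hypotheses satisfies Simes inequality, that is,

  $P\Big\{ \Big\{\hat q_{(1)} \le \frac{\alpha}{|I|}\Big\} \cup \Big\{\hat q_{(2)} \le \frac{2\alpha}{|I|}\Big\} \cup \cdots \cup \{\hat q_{(|I|)} \le \alpha\} \Big\} \le \alpha$,

then $P\{\mathrm{FDP} > \gamma\} \le \alpha$.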



Proof. Let $j$ be the smallest (random) index where the FDP exceeds
$\gamma$ for the first time at step $j$; that is, the number of false rejections
corresponding to the first $j - 1$ rejections divided by $j$ exceeds $\gamma$ for the first time
at $j$. If $j$ is such that $\gamma j < 1$, then $\mathrm{FDP} > \gamma$ at step $j$ implies $\hat q_{(1)} \le \alpha_j$. But
this implies

  $\hat q_{(1)} \le \alpha_j = \frac{\alpha}{s + 1 - j} \le \frac{\alpha}{|I|}$,

because the number of true null hypotheses $|I|$ necessarily satisfies
$|I| \le s - (j - 1)$ for such a $j$.

Similarly, if $j$ is such that $1 \le \gamma j < 2$, then we must have $\hat p_{(i)} \le \alpha_i$ and
$\hat p_{(j)} \le \alpha_j$ for some $i < j$, where $i$, $j$ correspond to true null hypotheses. But
for such a $j$, $\alpha_j = 2\alpha/(s + 2 - j)$, and so we must have $\hat q_{(2)} \le 2\alpha/(s - j + 2)$.
But, by definition of $j$, we must have $|I| \le s - (j - 2)$ and so $\hat q_{(2)} \le 2\alpha/|I|$.

Continuing in this way, if $m - 1 \le \gamma j < m$, the event $\mathrm{FDP} > \gamma$ at step $j$
implies $\hat q_{(m)} \le m\alpha/|I|$. The largest value of $j$ is of course $s$ and so the largest
possible $m$ is $\lfloor \gamma s \rfloor + 1$. Also, we cannot have $m > |I|$. So, with $M$ as in the
statement of the theorem,

$$
P\{\mathrm{FDP} > \gamma\}
  \le \sum_{m=1}^{M} P\Big\{ \hat q_{(m)} \le \frac{m\alpha}{|I|},\ m - 1 \le \gamma j < m \Big\}
  \le P\Big\{ \bigcup_{m=1}^{M} \Big\{ \hat q_{(m)} \le \frac{m\alpha}{|I|} \Big\} \Big\}.
$$

Part (ii) follows trivially. □

In fact, there are many joint distributions of positively dependent vari- 
ables for which Simes inequality is known to hold. In particular, Sarkar and 
Chang [11] and Sarkar [10] have shown that the Simes inequality holds for 
the family of distributions which is characterized by the multivariate positive 
of order 2 condition, as well as some other important distributions. 

Theorem 3.2 points toward a method that controls the FDP without any
dependence assumptions. One simply needs to bound the right-hand side of
(29). In fact, Hommel [6] has shown that

  $P\Big\{ \bigcup_{i=1}^{|I|} \Big\{ \hat q_{(i)} \le \frac{i\alpha}{|I|} \Big\} \Big\} \le \alpha \sum_{i=1}^{|I|} \frac{1}{i}$.

This suggests we replace $\alpha$ by $\alpha \big( \sum_{i=1}^{|I|} (1/i) \big)^{-1}$. But of course $|I|$ is unknown.
So one possibility is to bound $|I|$ by $s$, which then results in replacing $\alpha$ by
$\alpha/C_s$, where

(30)  $C_j = \sum_{i=1}^{j} (1/i)$.

As is well known, $C_s \approx \log(s + 0.5) + \zeta_E$, with $\zeta_E \approx 0.5772156649$ known as
Euler's constant. Clearly, changing $\alpha$ in this way is much too conservative
and results in a much less powerful method. However, notice in (29) that we
really only need to bound the union over $M \le \lfloor \gamma s \rfloor + 1$ events. Therefore,
we need to slightly generalize the inequality by Hommel [6], which is done
in the following lemma.
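
To get a feel for the sizes involved, the sketch below evaluates $C_s$ from (30), the logarithmic approximation quoted above, and the smaller constant $C_{\lfloor \gamma s \rfloor + 1}$. The values $s = 1000$ and $\gamma = 0.1$ are arbitrary choices made for illustration and do not come from the paper; the function name `harmonic` is likewise an assumption.

```python
import math

EULER_CONSTANT = 0.5772156649  # value quoted in the text


def harmonic(j: int) -> float:
    """C_j = sum_{i=1}^{j} 1/i, as in (30)."""
    return sum(1.0 / i for i in range(1, j + 1))


s, gamma = 1000, 0.1             # illustrative values only
m = math.floor(gamma * s) + 1    # number of events that actually need bounding

c_s = harmonic(s)
c_s_approx = math.log(s + 0.5) + EULER_CONSTANT
c_m = harmonic(m)

print(f"C_s = {c_s:.6f}, approximation = {c_s_approx:.6f}")
print(f"C_m = {c_m:.6f} with m = {m}")
print(f"alpha would be divided by {c_m:.3f} instead of {c_s:.3f}")
```

The gap between the two divisors is what motivates restricting the union in (29) to at most $\lfloor \gamma s \rfloor + 1$ events.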



Lemma 3.1. Suppose $\hat p_1, \ldots, \hat p_t$ are p-values in the sense that $P\{\hat p_i \le u\} \le u$
for all $i$ and $u$ in $(0, 1)$. Let their ordered values be $\hat p_{(1)} \le \cdots \le \hat p_{(t)}$. Let
$0 = \beta_0 \le \beta_1 \le \beta_2 \le \cdots \le \beta_m \le 1$ for some $m \le t$.

(i) Then

(31)  $P\{ \{\hat p_{(1)} \le \beta_1\} \cup \{\hat p_{(2)} \le \beta_2\} \cup \cdots \cup \{\hat p_{(m)} \le \beta_m\} \} \le t \sum_{i=1}^{m} (\beta_i - \beta_{i-1})/i$.

(ii) As long as the right-hand side of (31) is less than or equal to 1,
the bound is sharp in the sense that there exists a joint distribution for the
p-values for which the inequality is an equality.



Proof. Let $J$ be the smallest (random) index $j$ among $1 \le j \le m$ for
which $\hat p_{(j)} \le \beta_j$; define $J$ to be $t + 1$ if $\hat p_{(j)} > \beta_j$ for all $1 \le j \le m$. Let
$\theta_k = P\{J = k\}$. Then the left-hand side of (31) is equal to

  $P\Big\{ \bigcup_{k=1}^{m} \{J = k\} \Big\} = \sum_{k=1}^{m} \theta_k$

since the events $\{J = k\}$ are disjoint. We wish to bound $\sum_k \theta_k$. For any
$1 \le j \le m$,

  $\sum_{k=1}^{j} J\, \mathbf{1}\{J = k\} = J\, \mathbf{1}\{J \le j\} \le S_j$,

where $S_j$ is the number of p-values $\le \beta_j$. Taking expectations yields

(32)  $\sum_{k=1}^{j} k\theta_k \le t\beta_j$,  $j = 1, \ldots, m$.

For $j = 1, \ldots, m - 1$, multiply both sides of (32) by $1/[j(j + 1)]$, and for
$j = m$, multiply both sides by $1/m$; then sum over $j$ to yield

(33)  $\sum_{j=1}^{m-1} \frac{1}{j(j+1)} \sum_{k=1}^{j} k\theta_k + \frac{1}{m} \sum_{k=1}^{m} k\theta_k \le t \Big[ \sum_{j=1}^{m-1} \frac{\beta_j}{j(j+1)} + \frac{\beta_m}{m} \Big]$.

By changing the order of summation, the left-hand side of (33) becomes

  $\sum_{k=1}^{m-1} k\theta_k \sum_{j=k}^{m-1} \frac{1}{j(j+1)} + \frac{1}{m} \sum_{k=1}^{m} k\theta_k = \sum_{k=1}^{m-1} k\theta_k \Big( \frac{1}{k} - \frac{1}{m} \Big) + \frac{1}{m} \sum_{k=1}^{m} k\theta_k = \sum_{k=1}^{m} \theta_k$.

The right-hand side of (33) is easily seen to be the right-hand side of (31)
and (i) follows.

To prove (ii), we construct $\hat p_1, \ldots, \hat p_t$ as follows. Let $U_i$ be uniform on the
interval $I_i = (\beta_{i-1}, \beta_i)$ and let $U_{m+1}$ be uniform on $(\beta_m, 1)$. Let $\rho$ be equal to the
right-hand side of (31), assumed less than or equal to 1. Let $\pi_1, \ldots, \pi_m$ be
probabilities summing to 1, with $\pi_i \propto (\beta_i - \beta_{i-1})/i$. Then, with probability
$\pi_i \rho$, randomly pick $i$ indices and let those p-values be equal to $U_i$, and the
remaining $t - i$ p-values equal to $U_{m+1}$. With the remaining probability $1 - \rho$, let all
p-values be equal to $U_{m+1}$. With this construction it is easily checked that $\hat p_i$ is
uniform on $(0, 1)$ and the left-hand side of (31) is equal to the right-hand side of (31).
□
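
The right-hand side of (31) is simple to evaluate for any choice of the $\beta_i$. As a hedged check, the sketch below computes it with exact rational arithmetic for the Simes-type constants $\beta_i = i\alpha/t$, in which case the bound reduces to $\alpha C_m$ (Hommel's bound when $m = t$). The function names and the numerical values are assumptions chosen only for this illustration.

```python
from fractions import Fraction
from typing import Sequence


def lemma31_bound(betas: Sequence[Fraction], t: int) -> Fraction:
    """Right-hand side of (31): t * sum_i (beta_i - beta_{i-1}) / i, with beta_0 = 0."""
    previous = Fraction(0)
    total = Fraction(0)
    for i, beta in enumerate(betas, start=1):
        total += (beta - previous) / i
        previous = beta
    return t * total


def harmonic(j: int) -> Fraction:
    """C_j = sum_{i=1}^{j} 1/i as an exact fraction."""
    return sum((Fraction(1, i) for i in range(1, j + 1)), Fraction(0))


t, m, alpha = 20, 5, Fraction(1, 20)           # illustrative values
betas = [i * alpha / t for i in range(1, m + 1)]
bound = lemma31_bound(betas, t)
print(bound, alpha * harmonic(m))
```

Dividing each $\beta_i$ by $C_m$ therefore brings the bound back down to $\alpha$, which is the step used in the proof of Theorem 3.3.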



Theorem 3.2 and Lemma 3.1 now lead to the following result. 

Theorem 3.3. For testing $H_i : P \in \omega_i$, $i = 1, \ldots, s$, suppose $\hat p_i$ satisfies
(9). Consider the stepdown procedure with constants $\alpha_i' = \alpha_i / C_{\lfloor \gamma s \rfloor + 1}$, where
$\alpha_i$ is given by (26) and $C_j$ is defined by (30). Then $P\{\mathrm{FDP} > \gamma\} \le \alpha$.

Proof. By Theorem 3.2(i), $P\{\mathrm{FDP} > \gamma\}$ is bounded by the right-hand
side of (29) with $\alpha$ replaced by $\alpha / C_{\lfloor \gamma s \rfloor + 1}$, which is further bounded by the
same expression with $M$ replaced by $\lfloor \gamma s \rfloor + 1$. Then apply Lemma 3.1 with
$t = |I|$ and $\beta_i = i\alpha / (C_{\lfloor \gamma s \rfloor + 1} |I|)$. □
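
The modified constants of Theorem 3.3 differ from (26) only by the factor $1/C_{\lfloor \gamma s \rfloor + 1}$. A minimal sketch of how they might be computed follows; the function names are hypothetical, and $\gamma$ is again passed as an exact fraction so that the floor operations are not affected by floating-point rounding.

```python
import math
from fractions import Fraction
from typing import List


def harmonic(j: int) -> float:
    """C_j = sum_{i=1}^{j} 1/i, as in (30)."""
    return sum(1.0 / i for i in range(1, j + 1))


def fdp_constants_no_dependence(s: int, gamma: Fraction, alpha: float) -> List[float]:
    """Constants alpha_i' = alpha_i / C_{floor(gamma*s) + 1}, with alpha_i from (26)."""
    divisor = harmonic(math.floor(gamma * s) + 1)
    constants = []
    for i in range(1, s + 1):
        k = math.floor(gamma * i) + 1
        alpha_i = k * alpha / (s + k - i)   # constant (26)
        constants.append(alpha_i / divisor)
    return constants


consts = fdp_constants_no_dependence(10, Fraction(1, 10), 0.05)
print([round(c, 5) for c in consts])
```

With $s = 10$ and $\gamma = 0.1$ the divisor is $C_2 = 1.5$, so every constant in (26) shrinks by one third; these constants can be passed to any implementation of the stepdown rule (11).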

It is of interest to compare control of the FDP with control of the FDR.
Some obvious connections between methods that control the FDP in the
sense of (25) and methods that control its expected value, the FDR, can be
made. Indeed, for any random variable $X$ on $[0, 1]$, we have

$$
\begin{aligned}
E(X) &= E(X \mid X \le \gamma)\,P\{X \le \gamma\} + E(X \mid X > \gamma)\,P\{X > \gamma\} \\
     &\le \gamma\,P\{X \le \gamma\} + P\{X > \gamma\},
\end{aligned}
$$

which leads to

(34)  $\frac{E(X) - \gamma}{1 - \gamma} \le P\{X > \gamma\} \le \frac{E(X)}{\gamma}$,

with the last inequality just Markov's inequality. Applying this to $X = \mathrm{FDP}$,
we see that, if a method controls the FDR at level $q$, then it controls the
FDP in the sense $P\{\mathrm{FDP} > \gamma\} \le q/\gamma$. Obviously, this is very crude because
if $q$ and $\gamma$ are both small, the ratio can be quite large. The first inequality
in (34) says that if the FDP is controlled in the sense of (25), then the FDR
is controlled at level $\alpha(1 - \gamma) + \gamma$, which is greater than or equal to $\alpha$ but
typically only slightly. These crude arguments suggest that control of the
FDP is perhaps more stringent than control of the FDR.
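
The two bounds in (34) can be made concrete with a small numerical sketch. The levels used below ($q = \alpha = 0.05$ and $\gamma = 0.1$) are arbitrary illustrative choices, not values from the paper, and the function names are hypothetical.

```python
def fdp_tail_bound_from_fdr(q: float, gamma: float) -> float:
    """Markov bound from (34): FDR <= q implies P{FDP > gamma} <= q / gamma."""
    return min(1.0, q / gamma)


def fdr_bound_from_fdp_control(alpha: float, gamma: float) -> float:
    """First inequality in (34): P{FDP > gamma} <= alpha implies
    FDR <= alpha * (1 - gamma) + gamma."""
    return alpha * (1.0 - gamma) + gamma


q = alpha = 0.05
gamma = 0.1
print(fdp_tail_bound_from_fdr(q, gamma))         # bound on P{FDP > gamma}
print(fdr_bound_from_fdp_control(alpha, gamma))  # bound on the FDR
```

With these values, FDR control at 0.05 only guarantees $P\{\mathrm{FDP} > 0.1\} \le 0.5$, whereas FDP control at $\alpha = 0.05$ guarantees an FDR of at most 0.145.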

The comparison of actual methods, however, is complicated by the fact
that the FDR controlling procedure of Benjamini and Hochberg [1] is a
stepup procedure, but we have only considered stepdown procedures. It is
interesting to note that, in order to make our procedure work without any
dependence assumptions, we needed to change $\alpha$ to $\alpha / C_{\lfloor \gamma s \rfloor + 1}$. Benjamini
and Yekutieli [2] show that the Benjamini-Hochberg procedure that controls
the FDR at level $q$ can also work without dependence assumptions, if you
replace $q$ by $q/C_s$. Clearly, this is a more drastic change since $C_s$ is typically
much larger than $C_{\lfloor \gamma s \rfloor + 1}$. Such connections need to be explored more fully.

4. Conclusions. We have seen that a very simple stepdown procedure is
available to control the $k$-FWER under absolutely no assumptions on the
dependence structure of the p-values. Furthermore, control of the $k$-FWER
provides a measure of control for the actual number of false rejections, while
the number of false rejections in the case of the FDR can vary widely. We
have also considered two stepdown methods that control the FDP in the
sense of (25). The first method provides control under very reasonable types
of dependence assumptions, while the second holds in general.

After the revision and acceptance of this paper, we became aware of the
work by Hommel and Hoffman [7], which has much overlap with the results
in Section 2, and we would like to thank Helmut Finner for pointing out this
oversight. In particular, Hommel and Hoffman [7] provide Theorem 2.1(i)
with proof, Theorem 2.2 (stated but no proof) and a weaker version of
Theorem 2.3(ii) (stated but no proof). They attribute the idea of controlling the
number of false hypotheses to Victor [14], who also suggested control of the
FDP. However, Hommel and Hoffman did not further discuss control of the
FDP as they "could not find suitable procedures satisfying this criterion."
As far as we know, the three theorems in Section 3 which address control of
the FDP are new.